Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project

Authors

Hervé Bourlard

Marc Ferras

Nikolaos Pappas

Andrei Popescu-Belis

Steve Renals

Fergus R. McInnes

Peter Bell

Sandy Ingram

Maël Guillemot

Abstract

In the inEvent EU project [1], we aim to structure, retrieve, and share large archives of networked and dynamically changing multimedia recordings, consisting mainly of meetings, videoconferences, and lectures. More specifically, we are developing an integrated system that performs audiovisual processing of multimedia recordings and labels them in terms of interconnected “hyper-events” (a notion inspired by hypertext). Each hyper-event is composed of simpler facets, including audio-video recordings and metadata, which are then easier to search, retrieve, and share. In the present paper, we mainly cover the audio processing aspects of the system, including speech recognition, speaker diarization and linking (across recordings), the use of these features for hyper-event indexing and recommendation, and the search portal. We present initial results for feature extraction from lecture recordings using the TED talks.

Related resources

Proceedings of the First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013), Marseille

Multimedia Semantic Analysis in the PrestoSpace Project

PrestoSpace is a European-funded research project that aims at addressing the problem of decaying audio-visual archives throughout Europe by means of digitisation for preservation and access. One of the work areas within the project is Metadata Access and Delivery (MAD) which employs innovative methods of generating metadata for the digitised media in order to enhance the resulting archives and...

Toward Generic Intelligent Knowledge Extraction from Video and Audio: the Eu-funded Caretaker Project

The CARETAKER project, which is a 30-month project that has just kicked off, aims at studying, developing and assessing multimedia knowledge-based content analysis, knowledge extraction components, and metadata management sub-systems in the context of automated situation awareness, diagnosis and decision support. More precisely, CARETAKER will focus on the extraction of structured knowledge fr...

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...

Linking Video and Text via Representations of Narrative

The ongoing TIWO project is investigating the synthesis of language technologies, like information extraction and corpus-based text analysis, video data modeling and knowledge representation. The aim is to develop a computational account of how video and text can be integrated by representations of narrative in multimedia systems. The multimedia domain is that of film and audio description – an...

Publication date: 2013
